Skip to content

Conversation

@simonrosenberg
Copy link
Collaborator

Summary

This PR creates a single source of truth for all Commit0 benchmark hyperparameters and constant values by introducing benchmarks/commit0/constants.py.

Fixes #365

Changes

New File: benchmarks/commit0/constants.py

Created a centralized constants file containing all hyperparameters:

  • Dataset configuration: DEFAULT_DATASET, DEFAULT_DATASET_SPLIT, DEFAULT_REPO_SPLIT
  • Docker image configuration: DEFAULT_DOCKER_IMAGE_PREFIX, DEFAULT_IMAGE_TAG, CUSTOM_TAG_PREFIX
  • Build configuration: BUILD_TARGET
  • Workspace configuration: WORKSPACE_DIR
  • Git configuration: GIT_BRANCH_NAME, AGENT_BRANCH_NAME
  • Model configuration: DEFAULT_MODEL_NAME
  • Runtime configuration: DEFAULT_RUNTIME_API_URL, DEFAULT_REMOTE_RUNTIME_STARTUP_TIMEOUT, DEFAULT_CONVERSATION_TIMEOUT, DEFAULT_COMMAND_TIMEOUT
  • Evaluation configuration: TOTAL_INSTANCES

Updated Files

  • build_images.py: Updated to import constants from constants.py
  • run_infer.py: Updated to import constants from constants.py
  • eval_infer.py: Updated to import constants from constants.py

New Tests: tests/test_constants.py

Added comprehensive tests for all constants including:

  • Dataset constants validation
  • Docker constants format validation
  • Build constants validation
  • Workspace constants validation
  • Git constants validation
  • Model constants validation
  • Runtime constants validation
  • Evaluation constants validation
  • Integration tests for constructing Docker images, custom tags, workspace paths, and clone commands

Benefits

  1. Single source of truth: All constant values are now defined in one place
  2. Easy to audit: Reviewers can quickly check all hyperparameters in one file
  3. Maintainability: Changes to constants only need to be made in one location
  4. Testability: Constants are now tested to ensure they have valid values

Testing

All tests pass:

pytest benchmarks/commit0/tests/ -v
# 23 passed

@simonrosenberg can click here to continue refining the PR

This commit creates a single source of truth for all Commit0 benchmark
hyperparameters and constant values by introducing constants.py.

Changes:
- Create benchmarks/commit0/constants.py with all constant values:
  - Dataset configuration (DEFAULT_DATASET, DEFAULT_DATASET_SPLIT, DEFAULT_REPO_SPLIT)
  - Docker image configuration (DEFAULT_DOCKER_IMAGE_PREFIX, DEFAULT_IMAGE_TAG, CUSTOM_TAG_PREFIX)
  - Build configuration (BUILD_TARGET)
  - Workspace configuration (WORKSPACE_DIR)
  - Git configuration (GIT_BRANCH_NAME, AGENT_BRANCH_NAME)
  - Model configuration (DEFAULT_MODEL_NAME)
  - Runtime configuration (DEFAULT_RUNTIME_API_URL, timeouts)
  - Evaluation configuration (TOTAL_INSTANCES)

- Update build_images.py to import constants from constants.py
- Update run_infer.py to import constants from constants.py
- Update eval_infer.py to import constants from constants.py
- Add comprehensive tests for constants in test_constants.py

Fixes #365

Co-authored-by: openhands <openhands@all-hands.dev>
Per user request:
- Remove benchmarks/commit0/tests/test_constants.py
- Remove WORKSPACE_DIR constant from constants.py
- Revert all WORKSPACE_DIR usages back to hardcoded '/workspace'

Co-authored-by: openhands <openhands@all-hands.dev>
@openhands-ai
Copy link

openhands-ai bot commented Jan 28, 2026

Looks like there are a few issues preventing this PR from being merged!

  • GitHub Actions are failing:
    • Pre-commit checks

If you'd like me to help, just leave a comment, like

@OpenHands please fix the failing actions on PR #370 at branch `openhands/commit0-constants-refactor`

Feel free to include any additional details that might help me get this PR into a better state.

You can manage your notification settings

@simonrosenberg
Copy link
Collaborator Author

@OpenHands fix precommit issues

@openhands-ai
Copy link

openhands-ai bot commented Jan 28, 2026

I'm on it! simonrosenberg can track my progress at all-hands.dev

Co-authored-by: openhands <openhands@all-hands.dev>
@openhands-ai
Copy link

openhands-ai bot commented Jan 28, 2026

I've fixed the pre-commit issues in PR #370. The problem was a ruff formatting issue in benchmarks/commit0/run_infer.py where a long line needed to be split into multiple lines.

Change made:

  • Fixed line formatting in run_infer.py by splitting a long f-string assignment into a multi-line format (as required by ruff formatter)

Verification:

  • Ran all pre-commit hooks (ruff-format, ruff-check, pycodestyle, pyright) - all passed
  • Committed and pushed the fix to the openhands/commit0-constants-refactor branch

The PR should now pass the pre-commit checks in CI.

View full conversation

@simonrosenberg
Copy link
Collaborator Author

@OpenHands
/codereview-roasted
Make sure this PR doesn't alter any constant or behavior
Suggest next step refactoring for cleaner code

@openhands-ai
Copy link

openhands-ai bot commented Jan 28, 2026

I'm on it! simonrosenberg can track my progress at all-hands.dev

@openhands-ai
Copy link

openhands-ai bot commented Jan 28, 2026

Code Review Summary for PR #370

🟢 Taste Rating: Good taste - Elegant, simple solution

Verification: No Constants or Behavior Changed

I verified all 15 extracted constants match their original hardcoded values exactly:

  • Dataset config: DEFAULT_DATASET, DEFAULT_DATASET_SPLIT, DEFAULT_REPO_SPLIT
  • Docker config: DEFAULT_DOCKER_IMAGE_PREFIX, DEFAULT_IMAGE_TAG, CUSTOM_TAG_PREFIX
  • Build/Git config: BUILD_TARGET, GIT_BRANCH_NAME, AGENT_BRANCH_NAME
  • Runtime config: All timeout values and URLs ✅
  • Evaluation config: TOTAL_INSTANCES = 16

This is a pure refactoring with no behavioral changes.


🔧 Improvement Opportunities

  1. Naming Inconsistency - BUILD_TARGET vs DEFAULT_REPO_SPLIT. Consider consistent DEFAULT_ prefix for overridable values.

  2. Incomplete Extraction - /workspace/ path is hardcoded in 8+ places but not extracted.

  3. Magic Numbers Remain - Retry count 5 and timeout increment 100 still hardcoded in run_infer.py.

  4. Semantic Confusion - GIT_BRANCH_NAME vs AGENT_BRANCH_NAME could be clearer as SOURCE_BRANCH_NAME and AGENT_WORKING_BRANCH.


🚀 Suggested Next Steps for Cleaner Code

  1. Extract remaining hardcoded values:

    WORKSPACE_DIR = "/workspace"
    MAX_RETRIES = 5
    RETRY_TIMEOUT_INCREMENT = 100
  2. Add type hints to constants.py for better IDE support

  3. Consider grouping related constants into frozen dataclasses for larger constant sets

  4. Centralize environment variable handling - move os.getenv() calls with defaults into helper functions in constants.py

  5. Add __all__ export list to constants.py for explicit public API


Verdict: ✅ Worth Merging

Core logic is sound, constants are correctly extracted with no behavioral changes. The suggested improvements can be addressed in follow-up PRs.

View full conversation

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Regroup all commit0 hyper parameters in a single source of truth benchmarks/commit0/constants.py

3 participants